Problem statement

To classify a given silhouette as one of four types of vehicle, using a set of features extracted from the silhouette. The vehicle may be viewed from one of many different angles.

Description of Dataset

The data contains features extracted from the silhouettes of vehicles viewed from different angles. Four "Corgi" model vehicles were used for the experiment: a double-decker bus, a Chevrolet van, a Saab 9000 and an Opel Manta 400. This particular combination of vehicles was chosen with the expectation that the bus, the van and either one of the cars would be readily distinguishable, but that it would be more difficult to distinguish between the two cars.

Vehicle Classification

The purpose of the case study is to classify a given silhouette as one of four different types of vehicle, using a set of features extracted from the silhouette. The vehicle may be viewed from one of many different angles. Four "Corgi" model vehicles were used for the experiment: a double-decker bus, a Chevrolet van, a Saab 9000 and an Opel Manta 400. This particular combination of vehicles was chosen with the expectation that the bus, the van and either one of the cars would be readily distinguishable, but that it would be more difficult to distinguish between the two cars.

Data Dictionary

Importing Required Modules

Read the dataset

Check for Missing Data
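As a minimal sketch of this step, the snippet below counts missing values per column with pandas and fills numeric gaps with the column median. The small DataFrame is a made-up stand-in (only the column names come from this notebook), and median imputation is just one possible choice, not necessarily the one used here.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the vehicle data; the values are invented for illustration.
df = pd.DataFrame({
    "radius_ratio": [178.0, np.nan, 209.0, 159.0],
    "scaled_variance": [176.0, 170.0, np.nan, np.nan],
    "skewness_about": [6.0, 9.0, 14.0, 6.0],
    "class": ["van", "van", "car", "bus"],
})

# Count missing values in each column.
missing_counts = df.isnull().sum()
print(missing_counts)

# One simple option (an assumption): fill numeric gaps with the column median.
numeric_cols = df.select_dtypes(include="number").columns
df_filled = df.copy()
df_filled[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].median())
print(df_filled.isnull().sum().sum())
```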

Exploratory Data Analysis

Data Preprocessing (Understanding the data)

Understanding the outliers using different plots

Inference
The plots show that some columns contain outliers, such as radius_ratio, pr.axis_aspect_ratio, max.length_aspect_ratio, scaled_variance, scaled_variance.1, skewness_about and skewness_about.1.
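To make the idea of an outlier concrete, here is a small sketch of the 1.5 × IQR rule, which is the convention box plots typically use for their whiskers. The helper name `iqr_outliers` and the sample values are hypothetical; this is an illustration, not the check performed in this notebook.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Invented radius_ratio-like values; one is clearly far from the rest.
radius_ratio_sample = [141, 159, 168, 172, 178, 195, 209, 333]
print(iqr_outliers(radius_ratio_sample))
```

Applying the same helper to each numeric column gives a quick per-column count of flagged points to compare against the plots.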

Inference
The above count plot shows the number of samples for each of the four vehicle types.

Inference
The above histograms show the distribution of each column in the dataset.

Inference
The count plot compares skewness_about across the classes.

Inference

Inference

Inference

Apply different Classification Algorithms and tune them

Split Data for Training and Testing
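A minimal sketch of a stratified split with scikit-learn follows. The data is synthetic (generated with `make_classification`), and the 70/30 ratio, feature count and `random_state` are assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic four-class stand-in for the vehicle features (not the real data).
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=6,
    n_classes=4, random_state=42,
)

# Hold out 30% for testing; stratify keeps class proportions similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)
print(X_train.shape, X_test.shape)
```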

Logistic Regression Model

Training and Predicting

Predicting
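The training and prediction steps for logistic regression could look roughly like the sketch below. It assumes synthetic data in place of the vehicle features, and it scales the inputs first, which is a common practice rather than something this notebook confirms.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic four-class stand-in for the vehicle features (assumption).
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=6,
    n_classes=4, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Scale the features, then fit a multiclass logistic regression.
log_reg = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
log_reg.fit(X_train, y_train)

# Predict class labels for the held-out samples.
y_pred = log_reg.predict(X_test)
test_accuracy = log_reg.score(X_test, y_test)
print("test accuracy:", test_accuracy)
```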

Evaluation

Heatmap for Confusion Matrix

Classification Report

Accuracy
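The three evaluation outputs above can be sketched with `sklearn.metrics` as follows. The label lists are hand-written for illustration (the class names are hypothetical placeholders for the four vehicle types), so the numbers say nothing about the real models.

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Hand-written labels for illustration only; class names are placeholders.
labels = ["bus", "opel", "saab", "van"]
y_true = ["bus", "bus", "opel", "opel", "saab", "saab", "van", "van"]
y_pred = ["bus", "bus", "saab", "opel", "saab", "opel", "van", "bus"]

# Rows are true classes, columns are predicted classes, in the order of `labels`.
cm = confusion_matrix(y_true, y_pred, labels=labels)
print(cm)

# Per-class precision, recall and F1.
print(classification_report(y_true, y_pred, labels=labels))

# Fraction of samples predicted correctly.
acc = accuracy_score(y_true, y_pred)
print("accuracy:", acc)
```

The matrix `cm` can be passed to a plotting function such as seaborn's `heatmap` to draw it as a heatmap; in this toy example the off-diagonal entries for the two cars show the kind of confusion the dataset description anticipates.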

Creating DataFrame to roughly compare actual outputs and predicted outputs

Decision Tree

Training and Predicting

Predicting

Model Evaluation

Creating DataFrame to roughly compare actual outputs and predicted outputs

Visualising Decision Tree
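One lightweight way to inspect a fitted tree is a text dump of its rules via `export_text`; `sklearn.tree.plot_tree` is the graphical alternative. The sketch below assumes synthetic data and placeholder feature names, and it caps the depth at 3 only to keep the output readable.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic four-class stand-in for the vehicle features (assumption).
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=6,
    n_classes=4, random_state=0,
)
# Placeholder names; the real notebook would use the dataset's column names.
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

# A shallow tree keeps the printed rules short.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Text rendering of the learned split rules.
rules = export_text(tree, feature_names=feature_names)
print(rules)
```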

Support Vector Machine (SVM) Classifier

Predicting

Model Evaluation

Creating DataFrame to roughly compare actual outputs and predicted outputs

Hyperparameter Tuning using GridSearch
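A grid search over SVM hyperparameters might look like the sketch below, using scikit-learn's `GridSearchCV`. The parameter grid, the 3-fold cross-validation and the synthetic data are all assumptions for illustration; the grid actually searched in this notebook is not shown in the text.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic four-class stand-in for the vehicle features (assumption).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=6,
    n_classes=4, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Scaling lives inside the pipeline so each CV fold is scaled on its own training part.
pipe = Pipeline([("scale", StandardScaler()), ("svc", SVC())])

# Hypothetical grid; pipeline parameters are addressed as <step>__<param>.
param_grid = {
    "svc__C": [0.1, 1, 10],
    "svc__gamma": ["scale", 0.01, 0.1],
    "svc__kernel": ["rbf"],
}

search = GridSearchCV(pipe, param_grid, cv=3, scoring="accuracy")
search.fit(X_train, y_train)
print(search.best_params_, search.best_score_)

# By default the search refits the best pipeline on all training data for prediction.
y_pred = search.predict(X_test)
```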

Prediction using tuned model

K-Means

Elbow Plot
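The elbow method fits K-Means for a range of cluster counts and looks for the point where the within-cluster sum of squares (inertia) stops dropping sharply. The sketch below computes those inertia values on synthetic blobs; the range 1 to 8 and the data are assumptions.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data with four blobs, standing in for the scaled vehicle features.
X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

k_values = list(range(1, 9))
inertias = []
for k in k_values:
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    # inertia_ is the sum of squared distances of samples to their closest centre.
    inertias.append(km.inertia_)

for k, inertia in zip(k_values, inertias):
    print(k, round(inertia, 1))
```

Plotting `inertias` against `k_values` (for example with matplotlib's `plot`) gives the elbow plot itself.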

Initializing K-Means

Getting clusters

Getting clusters from the final model

Scatter plot of the clusters

Get performance metrics for all the applied classifiers
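One way to collect metrics for several classifiers is to loop over a dictionary of models and store the scores side by side. This is a sketch on synthetic data with default hyperparameters; the model names, the choice of accuracy and macro F1, and the data are assumptions rather than this notebook's actual setup.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic four-class stand-in for the vehicle features (assumption).
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=6,
    n_classes=4, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

models = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC()),
}

# Fit each model on the same split and record its test-set scores.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "macro_f1": f1_score(y_test, y_pred, average="macro"),
    }

# Print the models from best to worst accuracy.
for name, scores in sorted(results.items(), key=lambda kv: kv[1]["accuracy"], reverse=True):
    print(f"{name}: accuracy={scores['accuracy']:.3f}, macro_f1={scores['macro_f1']:.3f}")
```

K-Means is left out of this loop because it is unsupervised: its cluster ids do not correspond directly to class labels, so the same metrics cannot be applied without first mapping clusters to classes.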

Visually compare the performance of all classifiers

Best possible tuning from all the classifiers

Comparing all the models, we conclude that Logistic Regression, the Decision Tree Classifier and the tuned SVM give better results than the SVM before tuning and K-Means.

So we can use any of these three models to classify a silhouette as one of the four types of vehicle.

Future scope of improvement in the algorithm

This dataset deals with the classification of vehicles from pre-extracted silhouette features. The work could be developed further using OpenCV and deep learning, which are commonly used to identify the type of vehicle directly from images.